The objective of Experiment 1 was to establish baseline PSE response patterns in younger adults, and to examine how PSE magnitudes differ across the three response-option conditions (RFG-judgments / RFBG-judgments / RF-Ratings). A 2x3 mixed factorial design was utilised, consisting of a within-subjects variable of stimulus type (words / grey-scale line-drawings) and a between-subjects variable of response-option (RFG-judgments / RFBG-judgments / RF-Ratings).
A total of 186 subjects completed the online experiment: 122 females (M age = 26.02 years, SD = 10.04), 60 males (M = 28.10 years, SD = 10.98), and 2 non-binary (M = 19.50 years, SD = 2.12). This sample size is typical in the literature for detecting a medium effect size (.25); however, due to mixed conclusions regarding the direction of Recollection and Familiarity effects, it was difficult to determine an exact a-priori number of subjects.
To meet our younger-adult (YA) requirements, all participants were required to be aged between 18 and 59 years (actual range: 18-59). As our experiment involved English word stimuli, we also asked subjects whether English was their first language; the vast majority (93.01%) reported that English was indeed their first language.
The current sample primarily comprised participants sourced from voluntary participation websites such as (52.15%), where payment was given at the rate of £5/hr, and via the in-school (41.4%), where participants received course participation credits. A small number of participants were also recruited from social media and other online sources (Facebook: 3.76%; Call For Participants: 1.61%; Reddit: 0.54%; unspecified: 0.54%).
A random pool of grey-scale line-drawings was sourced from Rossion & Pourtois (2004), along with their corresponding written-word object names (see Appendix XX). All items were imported into Photoshop CC (20.0.04 Release), where they were converted from their native .bmp file format into .png files. The corresponding written-word items were also created in Photoshop using the Calibri sans-serif typeface and again exported as .png files. All items were automatically resized and presented at 250x250px by the online survey platform.
Data collection was conducted via the online survey platform . Participants initially completed an encoding phase, where target stimuli (word and picture stimuli) were randomly presented one-at-a-time on-screen. To ensure subjects directed their attention toward stimuli during the study phase, we utilised a simple encoding question at study: “Is this a picture or a word?”. All participants showed a response accuracy above 90% at study, indicating a high rate of attention toward the presented stimuli. This ability to determine whether participants were concentrating at study is why a blocked design (i.e. separate word / picture blocks) was avoided; separate blocks for each stimulus format would render the encoding question redundant, and whilst we could have used a different question (e.g. “Is this item pleasant?”), we were keen to avoid the aforementioned levels-of-processing effects, whereby deep-processing questions may produce ceiling effects for responses made toward picture stimuli.
The encoding phase was followed by a short distractor task, in which participants completed 20 multiplication sums. This was followed by the recognition task, where subjects were again randomly presented with word and picture items one-at-a-time on-screen, and were required to respond “Old”/“New” depending on whether they recognised the item or not. “Old” responses were succeeded by a follow-up screen on which participants were asked to report their recognition experience for the current item; the response-options available on this follow-up page differed between participants, with random allocation into either the RFG, RFBG, or RF-Ratings response-option conditions. Instructions for the memory test were highly similar between conditions, deviating only to explain key changes such as additional response options (i.e. “Both”, in the RFBG condition) and how certain responses should be reported (i.e. 0 on both scales to represent “Guessing” in the RF-Ratings condition).
DVs consisted of: i) the proportion of hits; ii) the proportion of FAs; iii) overall recognition (proportion of hits minus proportion of FAs); iv) d’ (d-prime; discrimination); and v) c (response-bias criterion). Proportions of hits and FAs were also calculated separately based upon the follow-up recognition judgments: Recollection, Familiarity, and Guessing. To achieve comparable proportions across the response-option conditions - in order to examine the PSE across the different groups - each group required slightly different proportion calculations. For the RFG group, simple proportions were used (i.e. the proportion of hits / FAs assigned Recollection, Familiarity, and Guessing). For the RFBG group, the proportion of “Both” responses was separately added to the Recollection and Familiarity proportions, since “Both” was defined as representing the two processes simultaneously. For the RF-Ratings group, proportions were calculated based on the number of responses at or above a threshold of 3: a response was classified as “Recollection” when subjects rated between 3-5 on the Recollection scale (any Familiarity rating was permissible), and a response was classified as “Familiarity” when subjects rated between 3-5 on the Familiarity scale (any Recollection rating was permissible). Guessing responses were calculated as the proportion of responses where participants rated 0 on both scales toward a single item (participants were instructed to respond in this way when their response was a complete guess).
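The scoring rules above can be sketched in code. This is an illustrative sketch only (the function and variable names are our own, not taken from the analysis scripts), showing how RFBG counts fold “Both” into the Recollection and Familiarity proportions, and how a single RF-Ratings response (two 0-5 scales) maps onto the Recollection / Familiarity / Guessing categories:

```python
def score_rfbg(counts):
    """counts: raw counts of 'R', 'F', 'B', 'G' responses for one participant.
    'Both' is added to BOTH Recollection and Familiarity, as it is defined
    as representing the two processes simultaneously."""
    total = sum(counts.values())
    return {
        "Recollection": (counts["R"] + counts["B"]) / total,
        "Familiarity": (counts["F"] + counts["B"]) / total,
        "Guessing": counts["G"] / total,
    }

def classify_rf_rating(rec, fam):
    """Classify one RF-Ratings response (rec and fam are 0-5 ratings).
    A single response can count toward both Recollection and Familiarity;
    only a 0 on BOTH scales counts as a Guess."""
    labels = []
    if rec >= 3:
        labels.append("Recollection")
    if fam >= 3:
        labels.append("Familiarity")
    if rec == 0 and fam == 0:
        labels.append("Guessing")
    return labels
```

Note that under these rules an RF-Ratings response of, say, 1 and 2 contributes to none of the three categories, which is one reason the per-category proportions in this group need not sum to 1.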
Separate 2 (stimuli format: words, pictures) x 3 (response-option condition: RFG-judgments, RFBG-judgments, RF-ratings) mixed ANOVAs were conducted on the mean proportion of hits and false alarms (FAs; see Table 1).
Table 1: Mean proportion of hits and FAs by stimuli-format and response-option condition.
In the ANOVA on the proportion of hits, there was a significant main effect of stimuli-format, \(F(1, 183) = 131.77\), \(\mathit{MSE} = 0.01\), \(p < .001\); pictures (\(M = 0.62\)) produced a higher proportion of hits than words (\(M = 0.47\)), supporting the notion that stimuli-format plays a role in item recognition. There was also a significant main effect of response-option condition, \(F(2, 183) = 6.46\), \(\mathit{MSE} = 0.09\), \(p = .002\), with the RFG group (\(M = 0.62\)) showing more hits than the RF-Ratings group (\(M = 0.48\)), suggesting fewer available response-options may facilitate accurate recognition. There was no significant interaction, \(F(2, 183) = 0.35\), \(\mathit{MSE} = 0.01\), \(p = .707\). Click here to see output: [Appendix A - ANOVA Prop hits][].
The ANOVA on the proportion of FAs also highlighted the role of stimuli-format in the memorability of items, with a significant main effect of stimuli-format, \(F(1, 183) = 61.18\), \(\mathit{MSE} = 0.01\), \(p < .001\), showing that words (\(M = 0.21\)) produced more FAs than pictures (\(M = 0.12\)). There was no significant main effect of response-option condition, \(F(2, 183) = 2.70\), \(\mathit{MSE} = 0.03\), \(p = .070\), or significant interaction, \(F(2, 183) = 0.51\), \(\mathit{MSE} = 0.01\), \(p = .603\). Click here to see output: [Appendix B - ANOVA Prop FAs][].
A similar pattern was again reflected in the ANOVA on overall performance accuracy; a significant main effect of stimuli-format, \(F(1, 183) = 409.20\), \(\mathit{MSE} = 0.01\), \(p < .001\), revealed that pictures (\(M = 0.50\)) showed greater overall task performance accuracy than words (\(M = 0.27\)). There was no significant main effect of response-option condition, \(F(2, 183) = 2.85\), \(\mathit{MSE} = 0.08\), \(p = .060\), or significant interaction, \(F(2, 183) = 0.33\), \(\mathit{MSE} = 0.01\), \(p = .720\). Click here to see output: [Appendix C - ANOVA Overall recognition][].
To assess the roles of discrimination and response bias, separate 2 (stimuli format: words, pictures) x 3 (response-option condition: RFG-judgments, RFBG-judgments, RF-ratings) mixed ANOVAs were conducted on d’ (d-prime; measure of sensitivity) and c-scores (decision criterion; see Table 2).
Table 2: Mean d’ and c-scores by stimuli-format and response-option condition.
The ANOVA on d’ scores was consistent with the hits, FAs, and overall recognition findings; a significant main effect of stimuli-format, \(F(1, 183) = 295.80\), \(\mathit{MSE} = 0.18\), \(p < .001\), showed that pictures (\(M = 1.62\)) facilitated better discrimination between hits and FAs than words (\(M = 0.86\)). There was again no significant main effect of response-option condition, \(F(2, 183) = 1.53\), \(\mathit{MSE} = 0.84\), \(p = .219\), or significant interaction, \(F(2, 183) = 0.25\), \(\mathit{MSE} = 0.18\), \(p = .778\). Click here to see output: [Appendix D - ANOVA d’ (discrimination)][].
In the ANOVA on c-scores, there was no significant main effect of stimuli-format, \(F(1, 183) = 2.31\), \(\mathit{MSE} = 0.11\), \(p = .130\), indicating response bias was unaffected by whether participants were responding to words or pictures. There was, however, a significant main effect of response-option condition, \(F(2, 183) = 6.44\), \(\mathit{MSE} = 0.51\), \(p = .002\); those in the RF-Ratings condition (\(M = 0.67\)) showed higher c-scores (and thus a more conservative response bias) than those in the RFG condition (\(M = 0.34\)), suggesting participants were less likely to respond “Old” when they were required to provide more detailed follow-up recognition judgments (Recollection: 0-5 / Familiarity: 0-5), compared to simply selecting one of three options (R, F, or G). There was no significant interaction, \(F(2, 183) = 0.14\), \(\mathit{MSE} = 0.11\), \(p = .869\). Click here to see output: [Appendix E - ANOVA c (bias)][].
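The d’ and c values analysed above follow the standard equal-variance signal-detection formulas, \(d' = z(H) - z(\mathit{FA})\) and \(c = -\tfrac{1}{2}[z(H) + z(\mathit{FA})]\). A minimal sketch (illustrative only; the marginal hit and FA rates plugged in below are group means, so the result will not reproduce the mean of individually computed d’ scores reported above):

```python
from statistics import NormalDist

def dprime_and_c(hit_rate, fa_rate):
    """Equal-variance signal-detection measures.
    d' = z(H) - z(FA)           (discrimination)
    c  = -(z(H) + z(FA)) / 2    (criterion; higher = more conservative)
    Note: z is undefined at rates of exactly 0 or 1, so a correction
    (e.g. adding 1/(2N) to extreme rates) is typically applied first."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate), -(z(hit_rate) + z(fa_rate)) / 2

# e.g. the marginal picture rates from Table 1:
d, c = dprime_and_c(0.62, 0.12)  # d ≈ 1.48, c ≈ 0.43
```

The positive c obtained here illustrates the conservative bias pattern: both the hit and FA rates sit below the point where the criterion would be neutral (c = 0).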
To determine the effects of stimuli-format and response-option on accurate recognition memory judgments, separate 2 (stimuli format: words, pictures) x 3 (response-option condition: RFG-judgments, RFBG-judgments, RF-ratings) mixed ANOVAs were conducted on the mean proportion of hits assigned Recollection, Familiarity, and Guessing (see Figure 1).
Figure 1: Proportion of hits assigned Recollection, Familiarity, and Guessing, by stimuli-format and response-option condition.
The ANOVA on Recollected hits revealed a significant main effect of stimuli-format, \(F(1, 178) = 158.42\), \(\mathit{MSE} = 0.03\), \(p < .001\), with pictures (\(M = 0.67\)) showing a higher proportion of Recollected hits than words (\(M = 0.45\)) - consistent with the previous findings suggesting enhanced memorability of pictures compared to words. There was also a significant main effect of response-option, \(F(2, 178) = 8.55\), \(\mathit{MSE} = 0.09\), \(p < .001\); the RF-Ratings group (\(M = 0.65\)) showed significantly more Recollected hits than both the RFG-group (\(M = 0.49\)) and the RFBG-group (\(M = 0.54\)). There was no significant interaction, \(F(2, 178) = 1.91\), \(\mathit{MSE} = 0.03\), \(p = .151\). Click here to see output: [Appendix F - ANOVA Rec hits][].
In the ANOVA on Familiarity hits, there was again a significant main effect of stimuli-format, \(F(1, 178) = 4.65\), \(\mathit{MSE} = 0.03\), \(p = .032\); words (\(M = 0.56\)) resulted in a higher proportion of Familiarity hits than pictures (\(M = 0.52\)), indicating that - when correctly recognised - words were not recognised in the same context-rich manner as pictures. There was also a significant main effect of response-option, \(F(2, 178) = 34.52\), \(\mathit{MSE} = 0.09\), \(p < .001\), with the RF-Ratings group (\(M = 0.71\)) showing more Familiarity hits than both the RFG-group (\(M = 0.38\)) and the RFBG-group (\(M = 0.52\)). The RFBG-group also showed significantly more Familiarity hits than the RFG-group (\(M = 0.38\)).
There was also a significant interaction, \(F(2, 178) = 34.42\), \(\mathit{MSE} = 0.03\), \(p < .001\) (see Figure 2). Within response-option conditions, words resulted in more Familiarity hits than pictures in both the RFG group (words: \(M = 0.48\); pictures: \(M = 0.29\)), \(t(178) = 6.07\), \(p < .001\); and RFBG group (words: \(M = 0.57\); pictures: \(M = 0.48\)), \(t(178) = 2.87\), \(p = .005\). Conversely, the RF-Ratings group showed the opposite pattern, with more Familiarity hits produced for pictures (\(M = 0.79\)) than words (\(M = 0.63\)), \(t(178) = -5.29\), \(p < .001\).
Between response-option conditions, word stimuli produced significantly more Familiarity hits in the RF-Ratings group (\(M = 0.63\)) compared tothe RFG group (\(M = 0.48\)), \(t(276.78) = -3.37\), \(p = .002\). For pictures, a higher number of Familiarity hits was evident in the RF-Ratings group (\(M = 0.79\)) compared to both the RFG group (\(M = 0.29\)), \(t(276.78) = -11.13\), \(p < .001\) and RFBG group (\(M = 0.48\)), \(t(276.78) = -6.94\), \(p < .001\). The RFBG group (\(M = 0.48\)) also showed a significantly higher number of Familiarity hits compared to the RFG group (\(M = 0.29\)), \(t(276.78) = -4.24\), \(p < .001\). Click here to see output: [Appendix G - ANOVA Fam hits][].
Figure 2: Interaction plot between stimuli-format and response-option condition for the mean proportion of hits assigned ‘Familiarity’.
The ANOVA on Guessing hits again showed a significant main effect of stimuli-format, \(F(1, 178) = 42.84\), \(\mathit{MSE} = 0.01\), \(p < .001\), with words (\(M = 0.13\)) showing more Guessing hits than pictures (\(M = 0.06\)) - a finding that aligns with the results for hits assigned Recollection and Familiarity, whereby words appear not to be recognised in the same context-rich manner as pictures. There was also a significant main effect of response-option, \(F(2, 178) = 14.97\), \(\mathit{MSE} = 0.03\), \(p < .001\), with the RF-Ratings group (\(M = 0.03\)) showing significantly fewer Guessing hits than both the RFG (\(M = 0.13\)) and RFBG (\(M = 0.13\)) groups. Despite ‘Guessing’ responses being permissible in all response-option groups, it seems that when two independent rating scales were required, participants were less likely to report a “Guess”. This could be because those in the RF-Ratings condition selected “New” more often than “Old” when making a complete guess, or instead, subjects might have opted to report lower levels of Recollection and Familiarity ratings (i.e., 1-3) rather than responding 0 on both scales.
There was also a significant interaction, \(F(2, 178) = 4.17\), \(\mathit{MSE} = 0.01\), \(p = .017\) (see Figure 3). Within response-option conditions, words resulted in more Guessing hits than pictures in both the RFG group (words: \(M = 0.17\); pictures: \(M = 0.08\)), \(t(178) = 5.38\), \(p < .001\); and RFBG group (words: \(M = 0.16\); pictures: \(M = 0.09\)), \(t(178) = 4.42\), \(p < .001\).
Between response-option conditions, word stimuli produced significantly fewer Guessing hits in the RF-Ratings group (\(M = 0.04\)) compared to both the RFG group (\(M = 0.17\)), \(t(281.42) = 5.44\), \(p < .001\), and the RFBG group (\(M = 0.16\)), \(t(281.42) = 5.18\), \(p < .001\). A similar pattern was also evident for pictures, with significantly fewer Guessing hits in the RF-Ratings group (\(M = 0.01\)) compared to both the RFG group (\(M = 0.08\)), \(t(281.42) = 2.70\), \(p = .020\), and the RFBG group (\(M = 0.09\)), \(t(281.42) = 3.15\), \(p = .005\). Click here to see output: [Appendix H - ANOVA guess hits][].
Figure 3: Interaction plot between stimuli-format and response-option condition for the mean proportion of hits assigned ‘Guessing’.
To determine the effects of stimuli-format and response-option on false recognition memory judgments, separate 2 (stimuli format: words, pictures) x 3 (response-option condition: RFG-judgments, RFBG-judgments, RF-ratings) mixed ANOVAs were conducted on the mean proportion of FAs assigned Recollection, Familiarity, and Guessing (see Figure 4).
Figure 4: Proportion of FAs assigned Recollection, Familiarity, and Guessing, by stimuli-format and response-option condition.
In the ANOVA on Recollection FAs, there was no significant main effect of stimuli-format, \(F(1, 136) = 2.78\), \(\mathit{MSE} = 0.04\), \(p = .098\), but there was a significant main effect of response-option, \(F(2, 136) = 10.70\), \(\mathit{MSE} = 0.12\), \(p < .001\), with those in the RF-Ratings group (\(M = 0.38\)) showing more Recollection FAs than both the RFG-group (\(M = 0.14\)) and the RFBG-group (\(M = 0.24\)).
This could indicate that when subjects were required to individually report Recollection and Familiarity on two 0-5 rating scales, they were more likely to experience (or report) false recognition (accompanied by non-existent contextual details) than when only three (RFG) or four (RFBG) response-options were provided. There were no significant interaction effects, \(F(2, 136) = 2.08\), \(\mathit{MSE} = 0.04\), \(p = .129\). Click here to see output: [Appendix I - ANOVA Rec FA][].
The ANOVA on Familiarity FAs did not yield any significant results; with no significant main effect of stimuli-format, \(F(1, 136) = 1.12\), \(\mathit{MSE} = 0.04\), \(p = .292\), no significant main effect of response-option, \(F(2, 136) = 0.62\), \(\mathit{MSE} = 0.15\), \(p = .539\), and no significant interaction effects, \(F(2, 136) = 1.12\), \(\mathit{MSE} = 0.04\), \(p = .331\). Click here to see output: [Appendix J - ANOVA Fam FA][].
Finally, the ANOVA on Guessing FAs also showed no significant main effect of stimuli-format, \(F(1, 136) = 0.51\), \(\mathit{MSE} = 0.04\), \(p = .476\), but there was a significant main effect of response-option, \(F(2, 136) = 15.69\), \(\mathit{MSE} = 0.11\), \(p < .001\), with the RF-Ratings group (\(M = 0.08\)) showing significantly fewer Guessing FAs than both the RFG (\(M = 0.35\)) and RFBG (\(M = 0.30\)) groups. This again aligns with previous results, suggesting that those in the RF-Ratings group were less likely to report guesses than the other groups, whether accurate or not. There was no significant interaction, \(F(2, 136) = 0.07\), \(\mathit{MSE} = 0.04\), \(p = .935\). Click here to see output: [Appendix K - ANOVA Guess FA][].
The Picture Superiority Effect (PSE) is a highly robust and replicable phenomenon. In recognition memory paradigms, the PSE has been shown to manifest as both increased recollection and familiarity (Dewhurst & Conway, 1994; Rajaram, 1993, 1996; Wagner et al., 1997; Yonelinas, 2002). The effect is present in children, adolescents, and healthy older adults (Whitehouse et al., 2006), though perhaps more striking is the fact that patients with Alzheimer’s disease, or those presenting early isolated memory impairments known as amnestic mild cognitive impairment (aMCI), also show memorial benefits toward pictures (Ally, 2012). This is supported by ERP studies demonstrating comparable enhancements to recollection-based ERP components between healthy older and aMCI groups when pictures, rather than words, are utilised (Ally et al., 2009). There is debate within the literature attempting to characterise the nature of memory deficits in aMCI: despite general agreement that recollection processes are impaired in such individuals, findings are highly inconsistent with regard to familiarity (Algarabel et al., 2012; Belleville, 2011; Pitarque, 2016; Wolk et al., 2011, 2013). The PSE has been largely overlooked as an avenue for settling this debate, despite recent reviews highlighting methodological differences across studies as a potential source of the inconsistent findings (Koen & Yonelinas, 2014; Migo et al., 2012; Schoemaker et al., 2014). The extent to which stimulus distinctiveness impacts successful recognition is currently unclear, and there is little consistency across studies with regard to what is considered a ‘picture’.
Many experiments utilise illustrations for their picture stimuli (van der Meulen et al., 2012; Westerberg et al., 2013; Wolk et al., 2011), with a standardised set of items published by Snodgrass & Vanderwart (1980) among the most-used illustrated picture stimuli within the domain of memory research (Bermúdez-Margaretto et al., 2018; Deason et al., 2015; Hockley, 2008; Martins & Lloyd-Jones, 2006; McBride & Dosher, 2002; Meade et al., 2019; Schmitter-Edgecombe et al., 2009; van der Meulen et al., 2012; Wagner et al., 1997; Wammes et al., 2016; Weldon et al., 1989; Weldon & Roediger, 1987; Whitehouse et al., 2006). The set consists of 260 line drawings of common, everyday objects (in black ink), along with their written-word counterparts (e.g. “shoe”). Items were selected on the basis of exemplifying a number of semantic categories, including animals, furniture, fruit, etc., and a range of normative data was collected for each item; indices of naming agreement, mental imagery agreement, visual complexity, and familiarity were all recorded for each drawing. The normative data for the Snodgrass & Vanderwart (1980) items has been continually revisited, with a number of studies gathering culturally-appropriate norms (e.g. in Spanish (Sanfeliu & Fernandez, 1996), Chinese (Yoon et al., 2004), and Russian (Tsaparina et al., 2011)), and additional testing of the relationship between reaction time and naming agreement (Székely et al., 2003). There are multiple theories of object recognition; the recognition-by-components theory proposed by Biederman (1987) identifies shape as the most crucial factor for successful recognition, in which case the object outlines found in the Snodgrass & Vanderwart (1980) set should be more than sufficient for experimental cognitive research. Other theories, however, posit that surface details such as colour and texture are just as crucial in forming object representations (Tanaka et al., 2001; Tarr & Bülthoff, 1998).
The wide-ranging applicability of the Snodgrass & Vanderwart (1980) items throughout a number of cognitive disciplines has led to a more recent revision of the items by Rossion & Pourtois (2004). This revision consists of the exact same objects, digitally re-drawn to include surface textures and shading. Additionally, this set provides greyscale and colour versions for all items, as opposed to the greyscale-only items found in the Snodgrass & Vanderwart (1980) set (see Figure 8 for example items contained in the Snodgrass & Vanderwart (1980) and Rossion & Pourtois (2004) stimuli sets). The Rossion & Pourtois (2004) revision now appears to be favoured over the original Snodgrass & Vanderwart (1980) set among many cognitive researchers (Ensor et al., 2019; Rollins & Riggins, 2018; Stenberg, 2006; Wolk et al., 2008), almost certainly attributable to the increased detail and the ability to choose whether colour is a necessary condition.
Despite their widespread use, line drawings have been criticised for their relative simplicity and lack of realism (Viggiano et al., 2004), with many researchers favouring the use of photographs as experimental stimuli (Embree et al., 2012; Pitarque, 2016; Troyer et al., 2012, 2016; Wang et al., 2013). Photographs of faces are especially useful in research examining emotion and face recognition (Barba, 1997; Bowen et al., 2019; Cui et al., 2016; Herzmann et al., 2018), though a number of common-object photograph sets have also emerged as ecological alternatives to line-drawn items (Adlington et al., 2009; Moreno-Martínez & Montoro, 2012; Viggiano et al., 2004). While the published sets of photographs are undoubtedly useful in a range of cognitive domains, they do not allow us to specifically examine stimuli format as a factor on its own, as the concepts depicted are unique to the set they derive from. In order to make such comparisons, and ensure any differences in performance (e.g. recognition memory ability) are indeed attributable to stimuli format, the objects depicted must be consistent across stimuli formats. The current study presents a new set of photographic stimuli that extends the set of words and drawings provided by Rossion & Pourtois (2004), wherein each of the concepts depicted has been carefully matched across formats. These new stimuli will be utilised throughout a number of planned recognition experiments that aim to systematically compare measures of recognition against different ‘levels’ of stimuli. The curation of a new set of photographs - carefully matched to other formats - allows investigation into whether picture superiority magnitudes are mediated by the format pictures are presented in.
The inconsistent use of different formats across studies has previously made it difficult to reconcile effects obtained in response to drawings with those obtained in response to photographs - an inherent problem when concepts are not matched across format. Normative data for the new set of photographs is also presented, allowing others who also wish to use our photograph stimuli to filter items by measures of naming agreement, mental imagery agreement, familiarity, visual complexity, and colour diagnosticity.
A total of 374 subjects completed the online experiment (see Table 3 for a breakdown of the gender and age of the sample). This sample size provided 20 data points for each of the five response types, while also ensuring the experiment did not last too long for participants (approximately 25 minutes). Subjects were recruited from both voluntary participation websites such as (where they received payment at the rate of £5/hr) and via the in-school (where they received course participation credits).
Table 3: Gender and age (SD) of the current sample.
To meet our YA requirements, all participants were required to be aged between 18 and 59 years (actual obtained range: 18-59 years). As our experiment involved typing the English labels for a range of image stimuli, subjects were also asked whether English was their first language; all but one participant (99.73%) indicated that English was indeed their first language.
A pool of 136 line drawings (Rossion & Pourtois, 2004) - depicting common, everyday objects - was brought forward from the previous experiment. These items (along with their written-word labels) would form two of the unique stimuli formats used in future recognition experiments (words and drawings). In this study, the drawings from Rossion & Pourtois (2004) were simply used as a reference in the photograph matching process. Corresponding photographs were obtained online with the aim of depicting the everyday objects in a similar manner to the drawings. The inherent subjectivity involved in this process may have led to images that were not a reliable ‘match’ to the concepts they were selected to depict (for example, the photograph chosen to depict the concept “bottle” may inadvertently provoke the majority of participants to give the label “wine”, thus indicating that this particular photograph fails to accurately depict the intended concept). To address this issue, and ensure all photographs more objectively depict the same concepts as the line drawings, three different photograph variations were found for each everyday object, with the aim of taking the best ‘match’ forward. An emphasis was placed on variety across these variations, with the aim of obtaining at least one photograph that very closely resembled the line-drawn depiction, and another offering a more modern depiction. Some items were substituted due to unique restrictions that meant they could not easily be translated into photographic format (for example, the shapes “arrow” and “star” cannot be represented similarly as photographs). Photo stimuli were obtained by searching open-source, copyright-free image websites (e.g. ; ) for photographs that depicted the same everyday objects as the line drawings (see Appendix B for the full list of image references).
Figure 5: Examples of matching pictures across Snodgrass & Vanderwart (1980), Rossion & Pourtois (2004), and photographs from the current study. Greyscale versions of the drawings and photographs are not presented in this example.
The matching process produced a total of 408 unique photographs. All were imported into Adobe Photoshop (20.0.04 Release), where the background was removed to isolate the object of interest from other potentially distracting visual details. This was completed manually using the magnetic lasso and polygonal lasso tools (edges were either feathered by 1px or left un-feathered). The orientation of isolated objects was adjusted to ensure they matched as closely as possible with their line-drawn counterpart (e.g. all photograph variations of the item ‘boot’ were adjusted so the toe was facing left and the heel facing right, as in the line drawing); this was often achieved by flipping or mirroring the object to ‘correct’ the direction.
Despite isolating objects from their background, a small number of photographs still contained irrelevant and potentially distracting details. For example, in one photograph variation of the item ‘piano’, there was a sign on the object that may have impacted how the item was named or rated. Such details were removed as best as possible using the clone stamp and content-aware fill tools. Any obvious text (e.g. brand names) and numbers were also removed from photographs using the same method (see Figure 6). The primary aim of the current study was to obtain photographs that could be clearly distinguished as a unique stimuli format among words and line drawings; it is conceivable that combining these formats (i.e. inadvertently including photographs that also contain written words) might affect recognition performance in ways that are not directly comparable to items defined only by a single category. Any text in our photographs was therefore removed, apart from a couple of exceptions whereby such details happened to be integral to the depiction of the object (e.g. the numbers found on a ruler or clock).
All photographs were exported from Photoshop in “.png” format in both their original colour and in greyscale (by setting saturation levels to 0). Final edits were completed in Adobe Lightroom (Classic, 8.2 Release): exposure (brightness) adjustments were made on images that appeared too light or too dark; highlights were decreased if some areas were too bright compared to the rest of the photograph; shadows were raised if some areas were too dark compared to the rest of the photograph; and noise reduction was applied to some items where isolating the subject had inadvertently made unwanted noise/grain more visible. The changes made to each image were systematically applied to both the colour and greyscale versions (e.g. if one variation of “shoe” had an exposure increase of .010 for the colour version, the greyscale version also received an exposure increase of .010). Some colour-specific adjustments were made to the colour photographs only, however; common photo artefacts such as chromatic aberration (purple fringing) were corrected, along with white balance normalisation. Finally, all photographs were placed on a 600x600 pixel white background, and made to fill this frame as much as possible (i.e. some items were constrained by height, whilst others were constrained by width).
Figure 6: Examples of background and text removal in photograph items.
This was a descriptive study; a mix of qualitative and quantitative data were gathered. Across three blocks, all participants provided five types of response toward photograph stimuli: i) Naming; ii) Familiarity; iii) Visual Complexity; iv) Colour Diagnosticity; and v) Mental Imagery Agreement. Excluding the Naming task (consisting of a typed single-word answer), all responses were provided on a 5-point ordinal scale. Within participants, the maximum number of response types provided for any one item was two; Naming and Familiarity responses were paired in one block, Visual Complexity and Colour Diagnosticity responses were paired in another, and Mental Imagery Agreement responses were always presented in a separate block. The order of these three blocks was counterbalanced across participants. Toward each individual photograph, participants made only one or two types of response before moving on to the next item, and the same items were not repeated to participants. For each photograph, the five types of required data were obtained by counterbalancing between participants (e.g. for the first variation of the “cat” photograph, the Naming and Familiarity data was obtained from one participant, the Visual Complexity and Colour Diagnosticity data was obtained from another, and the Mental Imagery Agreement data was obtained from a third).
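The between-participant counterbalancing described above can be sketched as a simple rotation (an illustrative sketch only; the function names and the rotation scheme are our own shorthand, not the actual allocation script): each photograph needs data from all three response blocks, and each participant supplies exactly one block per item, so rotating the assignment by participant index spreads every block across every item.

```python
# The three response-type blocks each participant completes.
BLOCKS = ["naming_familiarity", "complexity_colour_diagnosticity", "imagery"]

def block_for(participant_idx, item_idx):
    """Which response block a given participant provides for a given item.
    Rotating by (participant + item) guarantees that across any three
    consecutively numbered participants, each item receives all three blocks."""
    return BLOCKS[(participant_idx + item_idx) % len(BLOCKS)]
```

Under this scheme, participant 0 might rate item 0 for Naming/Familiarity while participant 1 rates the same item for Complexity/Colour Diagnosticity, and so on, so no participant ever sees the same photograph twice.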
Data collection was conducted via two online platforms: i) a survey platform that allowed for straightforward collection of consent, demographics, and computer compatibility data; and ii) an open-source experiment hosting platform for studies programmed in JavaScript (Peirce et al., 2019).
In the Naming and Familiarity block, participants were first asked “What is the name of the item depicted?”. Subjects were instructed to name each photograph as briefly and unambiguously as possible, with one name only, and to respond by typing their answer into the response box. If they did not know the name of an item, or had a tip-of-the-tongue experience, participants were instructed to type “no” for their answer (the term “don’t know” was avoided so as not to encourage subjects to deviate from single-word responses, as instructed). Following the naming judgement, with the same photograph still present on-screen, participants were next asked “How familiar is the item depicted?”. Subjects were instructed to judge each photograph according to how usual or unusual the item was in their realm of experience; specifically, familiarity was defined as “the degree to which you come in contact with, or think about, the concept”, and subjects were encouraged to rate the concept itself rather than the particular way it was currently shown. Participants selected one value from the 5-point scale, ranging from very unfamiliar (1) to very familiar (5), and were encouraged to use the full range of the scale throughout the set of photographs.
In the Visual Complexity and Colour Diagnosticity block, participants were first instructed to respond to the question “How visually complex is this picture?” using a 5-point scale that ranged from “very simple” (1) to “very complex” (5). Complexity was defined to subjects as “the amount of detail in the picture”; in contrast to the familiarity ratings, participants were encouraged here to rate the complexity of the picture itself, rather than the real-life item. If the photograph shown was greyscale, subjects would simply move on to the next item. If the item shown was in colour, however, participants were also required to make a colour diagnosticity judgement. This concept was defined as “how typical / normal the colour of the item is”, and subjects rated it on a 5-point scale ranging from “Not at all diagnostic (i.e. this item could be in any other colour equally well)” (1) to “Highly diagnostic (i.e. this item appears only in this colour in real life)” (5). Participants were instructed to utilise the full range of options on the scale when making visual complexity and colour diagnosticity judgements. After these ratings, a fixation cross was presented during a 1s interstimulus interval.
Due to the slight change in procedure and increased task complexity, Mental Imagery Agreement ratings were always acquired in an individual block (i.e. not alongside any other response types). First, participants were presented with a written label for 3s (e.g. “cat”) and told to focus their attention on the word. Once the written word disappeared, a beep tone was played alongside the instruction “close your eyes and imagine this item” (subjects were encouraged to close their eyes and begin imagining the item as soon as they heard the tone, but the written instruction was included as a further prompt). After 3s a second beep tone sounded to alert subjects to open their eyes, at which point they were presented with a photograph of the item they had been instructed to imagine. On a 5-point scale, participants were asked to “rate the agreement between your mental image and the picture”, from “low agreement” (1) to “high agreement” (5). The degree of agreement was defined as “how similar your mental image of the item is to the picture shown”. A fixation cross was displayed for 1s before the next word item was shown.
All responses were self-paced; the timing was only controlled during the study/imagine section of the Mental Imagery Agreement block.
Figure 7: Data collection procedure for Mental Imagery Agreement responses.
The naming responses for each photograph item were manually assessed for spelling and typing errors. Automatic spell-checking software was avoided so as not to inadvertently introduce unique names that were not actually given by participants. The vast majority of errors were unambiguous and easy to correct (e.g. “anker” = “anchor”, “peguin” = “penguin”, “ssnowman” = “snowman”), or consisted of transforming plural words to singular (or vice versa, depending on the form of the intended label - e.g. “sock” to “socks”). Some responses were a little more ambiguous, and necessitated comparison with the photographs they were given in response to for additional clarity (e.g. a photograph depicting a plug that would fit into North American electrical sockets was labelled “usplug”; given the nature of our UK-based sample, it is likely the subject meant “U.S. (i.e. United States) plug”).
There were instances where subjects provided a sensible and correctly spelled English word that was nevertheless clearly a typo when examined against the photograph it was given in response to (e.g. “dock” for a photograph depicting a duck, “frock” for a frog, and “beer” for a bear). The most ambiguous spelling error to correct was “bittle”, which was provided by more than one participant and to more than one item; separate inspections of the photographs participants were responding to made this easy to correct, however, with one participant clearly meaning to respond “bottle”, whilst the other meant to respond “beetle”. Though participants were instructed to give only a single label for each item, some multiple-word responses (typed without spaces) were found during the spell-checking process. On such occasions, a judgement was made as to whether the multiple words should be retained, or whether the response could be shortened to a single word. A general rule was applied whereby if the other words provided additional information, they were retained (e.g. “maledear” - presumably “male deer” - was kept as a two-word answer). Multiple-word responses were generally shortened to a single word when the intended label for the item was clearly present and no information was lost in the process (e.g. “haircomb” was shortened to the intended answer “comb”). It is noted that there was some inherent subjectivity in this process, though as such items were uncommon relative to straightforward responses, their overall effects are estimated to be negligible.
Finally, some responses were changed to “no” as they were clearly intended to signify that the responder did not know the name of the item shown; the experiment instructed participants to type “no” in these instances, though the labels “none” and “idk” (a common abbreviation for “I don’t know”) were provided instead. There was also a single response that was manually changed to “no”, as the provided label was a single letter and it was thus entirely unclear what the intended answer should be (see Appendix A for a full list of manipulations to naming responses). This process yielded data that could be used to determine which photograph variation best matched the intended concept (e.g. 100% of participants labelling an object “bottle” indicates a perfect match) and which did not (e.g. only 50% of participants labelling the item “bottle”, whilst the other 50% gave the label “wine”, indicates a poor match). Photographs showing poor agreement across participant-generated labels, or those where the majority of labels differed from the intended concept, could be replaced with the variation demonstrating the most accurate depiction.
A number of variables were calculated prior to analysis. For familiarity, visual complexity, colour diagnosticity, and mental imagery agreement, mean ratings were calculated for each photograph (see Appendix B). Mean reaction times (RTs) were also calculated for each photograph / response variable, including naming responses. For naming responses, accuracy was defined as the proportion of subjects reporting the correct/intended label for any given item (e.g. 80% of subjects correctly labelled a photograph of the moon as “moon”). Percentage agreement was also calculated (i.e. the proportion of subjects providing the most frequent name, regardless of whether it matched the correct/intended label) in order to compute H values for each item. The H statistic also reflects naming agreement, but it takes into account the total number of unique labels given for an item. This is especially useful for comparing similar items, as it captures information not provided by simple agreement proportions. For instance, if the first variation of the photo moon (‘moon-1’) demonstrated 90% naming agreement among subjects, and the second variation (‘moon-2’) also demonstrated 90% naming agreement, it would appear as if both versions offer the same level of agreement among participants. However, ‘moon-1’ may have received a total of 2 unique names (e.g. moon, planet), while ‘moon-2’ received a total of 4 unique names (e.g. moon, planet, earth, comet). H values utilise this information to determine which item shows the best naming agreement (in other words, the item with the fewest unique names). The original formula by Snodgrass & Vanderwart (1980) was used to calculate H values:

\[ H = \sum_{i=1}^{k} p_i \log_2\!\left(\frac{1}{p_i}\right) \]

where \(k\) is the number of unique names given to an item and \(p_i\) is the proportion of subjects providing each name.
An H value of 0 indicates perfect naming agreement (all subjects responded with the same label for that item). Items showing an H value of 1 signify that two unique names were provided in identical proportions (e.g. 10 subjects responded “moon” and 10 subjects responded “planet”). As the H value increases, overall naming agreement decreases.
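As a worked sketch of these definitions (the function name is our own, not from the study's analysis scripts), naming accuracy, percentage agreement, and H can all be computed from a list of cleaned labels:

```python
from collections import Counter
from math import log2

def naming_stats(names, intended):
    """Accuracy: proportion giving the intended label.
    Agreement: proportion giving the most frequent label.
    H: Snodgrass & Vanderwart's (1980) sum of p_i * log2(1/p_i)
    over the k unique names, so H = 0 means perfect agreement."""
    counts = Counter(names)
    n = len(names)
    h = sum((c / n) * log2(n / c) for c in counts.values())
    return {
        "accuracy": counts[intended] / n,
        "agreement": counts.most_common(1)[0][1] / n,
        "H": h,
    }
```

For the “moon”/“planet” example above, 10 responses of each give H = 1.0, whereas 20 identical responses give H = 0.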
Summary statistics (mean and SD) for each of the measured variables are shown in Table 4. Data for the grey and colour photographs are presented alongside previously obtained normative values for a number of other stimuli formats (all from Rossion & Pourtois (2004), who published revised norms for Snodgrass & Vanderwart’s (1980; S&V) original line drawings, as well as for their own re-drawn versions containing shading and texture detail). The data from previous studies were not used in any statistical analyses. To examine whether the grey and colour photographs from the current study demonstrated any differences, a series of independent-samples t-tests was run on each variable, as well as on their corresponding reaction times (excluding colour diagnosticity scores, which were obtained only in response to the colour items and thus cannot be compared). Mean (and SD) values for all 816 unique photograph items are presented in Appendix B.
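The fractional degrees of freedom reported for these comparisons indicate Welch's unequal-variances t-test; a minimal sketch of that computation (our own illustration, not the analysis script used in the study) is:

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom,
    appropriate when the two groups' variances (or sizes) differ."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df
```

The same result is available via `scipy.stats.ttest_ind(x, y, equal_var=False)` where SciPy is installed.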
Naming accuracy was very high for all photographs (M = 0.95), indicating that overall, the selected items closely depicted the intended concepts. Compared with the other stimuli formats, there appears to be a steady increase in accuracy as items become more distinctive (see Table 4). Accuracy rates did not differ between the grey (M = 0.94) and colour (M = 0.95) versions of the photographs [\(t(749.02) = -0.44\), \(p = .660\)].
H values were also low across all items (M = 0.23), showing that subjects generally agreed on how the items should be named. Similar to naming accuracy, naming agreement also appears to steadily increase as items become more distinctive (as indicated by decreasing H values - see Table 4). While Rossion & Pourtois (2004) observed significantly better naming agreement for their colour - rather than greyscale - items, this pattern did not reach significance with the current set of photographs; H values did not differ between the grey (M = 0.24) and colour (M = 0.23) photographs [\(t(747.06) = 0.55\), \(p = .581\)].
A mean reaction time (RT) of 3.9s was observed for naming responses. While this was of little interest on its own, and could not be compared with RTs obtained for the other stimuli formats because our methodology differed slightly (RTs were only recorded once subjects had typed their response and clicked the mouse to signify they had finished), it was useful for making comparisons between the grey and colour items (though no difference was observed [M grey = 3.99s, M colour = 3.81s, \(t(666.35) = 1.46\), \(p = .144\)]). Overall, these analyses suggest that the current photographs closely resemble the drawings they were designed to match, with high levels of naming accuracy and agreement among subjects. The absence of any colour differences indicates there were no naming advantages when photographs were made even more distinctive through the addition of colour.
Table 4: Summary statistics for each of the measured variables. Mean values are presented in bold (SDs are shown in parentheses).
Scores of mental imagery agreement were moderate across all items (M = 3.6). While no colour differences were previously observed between stimuli formats, the grey (M = 3.46) photographs in the current study showed significantly lower mental imagery agreement scores than the colour (M = 3.74) items [\(t(797.29) = -6.47\), \(p < .001\)]. Comparisons with previous normative data also highlight how the grey photographs exhibited uniquely poorer mental imagery agreement scores than any of the other stimuli formats (see Table 4). RTs between the grey (M = 3.05s) and colour (M = 2.84s) items did not significantly differ [\(t(574.37) = 1.96\), \(p = .051\)].
Familiarity scores were high overall (M = 4.18), and in line with previous findings, there was no difference between the grey (M = 4.15) and colour (M = 4.21) items [\(t(812.90) = -1.59\), \(p = .113\)]. However, familiarity scores for the current set of photographs were higher than those obtained for any of the other stimuli formats, and while there previously appeared to be a decline in familiarity as stimuli become more distinctive (from line drawings, to grey shaded, to colour shaded), such a pattern was not evident with the current photographs (see Table 4). RTs between the grey (M = 0.98s) and colour (M = 0.99s) items did not significantly differ [\(t(783.06) = -0.32\), \(p = .747\)].
Visual complexity ratings were moderate across all of the items (M = 3.35). Colour (M = 3.15) photographs showed significantly higher scores of visual complexity than grey (M = 2.86) photographs [\(t(813.64) = -6.51\), \(p < .001\)]. This finding is further demonstrated when compared to the scores from the other stimuli formats (see Table 4); where grey photographs show comparable levels of visual complexity, the colour photographs show higher scores than all of the other formats. There was no significant difference between the RTs of grey (M = 3.31s) and colour (M = 3.39s) items [\(t(739.22) = -1.07\), \(p = .287\)].
For each concept represented in the photographs, one variation (e.g. shoe-1, shoe-2, or shoe-3) was selected for inclusion in a final list of stimuli that would be taken forward into subsequent recognition experiments. The normative naming data was assessed to establish which version best matched the existing line-drawn depictions of the concepts (Rossion & Pourtois, 2004). Naming was favoured over all of the other variables as, if an item was found to primarily convey a different concept than was intended during the naming task (e.g. if a photograph of the fruit ‘orange’ was labelled ‘grapefruit’ by the majority of subjects), then it could not be sufficiently compared to its line-drawn (and written-word) counterpart during recognition studies.
At least 20 unique naming responses were collected for each of the 816 photographs (408 grey items and 408 colour items). The proportion of ‘correct’ responses (i.e. names that were congruent with the intended concept) and the proportion of ‘don’t know’ responses were calculated for each item. Photographs were excluded if they:
54 photographs were found to meet at least one of the above criteria and were therefore excluded. Regardless of whether an excluded item was grey or colour, it was also necessary to remove its grey or colour partner (since both versions were needed to make comparisons across recognition experiments). Thus, a total of 64 items (32 grey / 32 colour) were excluded at this stage (many items already had both their grey and colour versions flagged by the original criteria).
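The partner rule can be sketched as follows (the item identifiers are hypothetical, used only for illustration):

```python
def exclude_with_partners(flagged, partner):
    """Extend a set of flagged item IDs with each item's grey/colour
    partner, since both versions must be dropped together."""
    return set(flagged) | {partner[item] for item in flagged}
```

Because many flagged items' partners were themselves already flagged, the final exclusion set (64) is smaller than twice the number of originally flagged items (54).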
Next, the proportions of correct responses were compared between grey and colour photographs in order to identify the items showing the smallest difference. In order to manipulate colour in later recognition experiments, it was important to select items where naming was congruent across colour/grey versions; in other words, it would be difficult to attribute particular recognition response patterns to the addition of colour (if a difference were found) when the grey version could not be identified (or encoded) similarly. Variations exhibiting the smallest difference between colour and grey items (for the proportion of correct responses) were taken forward, while the rest were excluded. In a number of instances, multiple variations of the same object had the same difference score. For example, all three variations of the item “balloon” exhibited perfect naming agreement, irrespective of whether they were presented in colour or grey (and thus “balloon1”, “balloon2”, and “balloon3” all had a difference score of 0). For items where more than one variation remained, manual rankings were obtained from two of the researchers to determine which variation best depicted the intended concept. For each item, the researchers independently studied the remaining variations and ranked them from best (1) to worst (2 or 3, depending on the number of variations that remained). The rankings from both researchers were collated; items where there was agreement as to which variation best depicted the intended concept were selected for inclusion in the final stimuli list. For all items where the researchers’ rankings disagreed, one of the variations was selected at random.
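This selection step can be sketched as follows (the accuracy values are hypothetical, and the random tie-break stands in for the researchers' manual rankings, which resolved most ties):

```python
import random

def select_variation(variants, rng=None):
    """variants maps a variation name to its (grey_accuracy, colour_accuracy)
    pair. Keep the variation with the smallest grey/colour naming difference,
    breaking exact ties at random."""
    rng = rng or random.Random(0)
    diffs = {v: abs(g - c) for v, (g, c) in variants.items()}
    best = min(diffs.values())
    tied = sorted(v for v, d in diffs.items() if d == best)
    return rng.choice(tied)
```

For instance, a variation with accuracies of .90 (grey) and .95 (colour) would be preferred over one with .85 and .95, since its grey/colour difference is smaller.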
For naming responses (accuracy, agreement [H], and RTs), no differences were observed between the grey and colour photographs. Such a result was expected for accuracy and agreement scores; the addition or absence of colour should not alter how participants identify (and thus label) items, except in rare instances where a lack of colour may lead to the misidentification of an object (e.g. incorrectly labelling a greyscale photograph of an orange as ‘grapefruit’). The data indicate, however, that this was not common, with the grey set of photographs exhibiting equally high levels of naming accuracy as the colour photographs. The absence of RT differences between the colour and greyscale sets was not expected for naming responses. It is reasonable to assume that colour photographs - with an additional layer of contextual information compared to grey items - would be identified (and therefore named) more quickly than grey photographs (e.g. a colour photograph of an orange should avoid the potential ambiguity that might accompany a greyscale depiction, which could initially be confused for another type of fruit). Indeed, Rossion & Pourtois (2004) demonstrated RTs consistent with this hypothesis, with colour drawings showing significantly quicker RTs than grey items. The lack of difference in the current data could be attributable to ceiling effects, whereby all photographs were sufficiently unambiguous and were quickly identified irrespective of whether they were presented in greyscale or colour. Examination of the other naming data, showing similarly high levels of accuracy and agreement across grey and colour items, supports this notion.
Scores of mental imagery agreement produced particularly interesting results between the grey and colour items. Grey photographs exhibited a significantly poorer match with subjects’ imagined representations of the objects than the colour items. Colour differences were not observed previously between drawings (Rossion & Pourtois, 2004), and comparing the current data with that obtained in other studies (see Table 4) demonstrates how the greyscale photographs show uniquely lower mental imagery agreement scores compared with any of the other stimuli formats. To imagine the objects, it seems likely that subjects would conjure an image of how they naturally see the item in their everyday lives - which, for the majority of subjects, would presumably be a colour representation. Therefore, when presented with greyscale depictions, subjects may have been more inclined to report that the item did not align quite as well as those presented in colour. However, it is unclear why a similar pattern is not also evident when comparing grey and colour drawings (Rossion & Pourtois, 2004). It may be that photographs promote stricter internal criteria when subjects must decide whether an item is a good match to their mental image. With line-drawn / illustrated items, subjects may simply accept that the items are baseline depictions, and that they will only be able to match their real-world mental images to a certain degree - thus leading to a generally more liberal response bias throughout. The addition of colour may therefore do very little to further reconcile the match between the drawing and the real-world mental representation. When subjects are responding only to photographs, the ecological nature of the items may facilitate deeper critical evaluation of whether they offer a good match to mental images, and thus promote a more conservative response bias.
Colour may therefore be a far more important factor in photographs than it is in line drawings for allowing participants to decide whether an item matches well with their mental image.
There were no colour differences in familiarity scores. This result was expected - participants were asked to rate the degree to which they come in contact with, or think about, the concept itself rather than the particular depiction shown, and there is no apparent reason why colour should influence such ratings. Visual complexity, on the other hand, where participants were required to directly rate the amount of detail in the picture, did show an expected difference. Colour photographs were rated as significantly more visually complex than grey items, presumably due to their additional layer of contextual information. When compared with the previous data obtained for drawings, the greyscale photographs showed comparable levels of visual complexity, while the colour photographs showed higher levels than any of the other formats. It is unclear why the photographs of the current study showed colour differences when grey and colour drawings did not, though it may tie in with the hypothesis proposed to explain the mental imagery agreement data. Subjects may apply stricter internal criteria when rating stimuli that are perceived as being closer to how they would be experienced in real life - when viewing a colour photograph of a rabbit, it is difficult to see how we could make the item any more visually complex than it already is (at least in a 2D medium). It is probable that subjects notice the absence of colour when viewing the greyscale items, since these depict the items in a way that they are not usually seen, and thus determine that the items could be made more complex if shown in colour (giving lower visual complexity ratings as a result).
The objective of the current study was to establish a new set of ecological photograph stimuli to be taken forward into subsequent recognition memory experiments. Matching items with previously established drawings (and words) would allow the effects of stimuli format on recognition response patterns to be directly examined. A range of normative data was collected for 816 unique photograph items. These items may prove useful for a range of cognitive researchers who wish to utilise a set of high-quality and realistic object stimuli, especially given the flexibility of items that can be filtered based on colour, naming agreement, familiarity, etc. For the needs of the current body of research, the naming data were used to determine which photographs best matched the intended concepts among a number of possible variations. This allowed for the systematic comparison of recognition memory performance toward three distinct stimuli formats (words, drawings, and photographs) in the following study, in an effort to establish how stimuli of varying perceptual distinctiveness may affect recognition response patterns. Such comparisons might help to reconcile the inconsistencies present across recognition memory research, such as those in studies attempting to determine whether familiarity processes are preserved in those with amnestic Mild Cognitive Impairment (aMCI).
Adlington, R. L., Laws, K. R., & Gale, T. M. (2009). The Hatfield Image Test (HIT): A new picture test and norms for experimental and clinical use. Journal of Clinical and Experimental Neuropsychology, 31(6), 731–753. https://doi.org/10.1080/13803390802488103
Algarabel, S., Fuentes, M., Escudero, J., Pitarque, A., Peset, V., Mazón, J.-F., & Meléndez, J.-C. (2012). Recognition memory deficits in mild cognitive impairment. Aging, Neuropsychology, and Cognition, 19(5), 608–619. https://doi.org/10.1080/13825585.2011.640657
Ally, B. A. (2012). Using Pictures and Words To Understand Recognition Memory Deterioration in Amnestic Mild Cognitive Impairment and Alzheimer’s Disease: A Review. Current Neurology and Neuroscience Reports, 12(6), 687–694. https://doi.org/10.1007/s11910-012-0310-7
Ally, B. A., Gold, C. A., & Budson, A. E. (2009). The picture superiority effect in patients with Alzheimer’s disease and mild cognitive impairment. Neuropsychologia, 47(2), 595–598. https://doi.org/10.1016/j.neuropsychologia.2008.10.010
Barba, G. D. (1997). Recognition Memory and Recollective Experience in Alzheimer’s Disease. Memory, 5(6), 657–672. https://doi.org/10.1080/741941546
Belleville, S. (2011). Impact of novelty and type of material on recognition in healthy older adults and persons with mild cognitive impairment. 10.
Bermúdez-Margaretto, B., Beltrán, D., Cuetos, F., & Domínguez, A. (2018). Brain Signatures of New (Pseudo-) Words: Visual Repetition in Associative and Non-associative Contexts. Frontiers in Human Neuroscience, 12, 354. https://doi.org/10.3389/fnhum.2018.00354
Biederman, I. (1987). Recognition-by-Components: A Theory of Human Image Understanding. 33.
Bowen, H. J., Fields, E. C., & Kensinger, E. A. (2019). Prior Emotional Context Modulates Early Event-Related Potentials to Neutral Retrieval Cues. Journal of Cognitive Neuroscience, 31(11), 1755–1767. https://doi.org/10.1162/jocn_a_01451
Cui, L., Shi, G., He, F., Zhang, Q., Oei, T. P. S., & Guo, C. (2016). Electrophysiological Correlates of Emotional Source Memory in High-Trait-Anxiety Individuals. Frontiers in Psychology, 7. https://doi.org/10.3389/fpsyg.2016.01039
Deason, R. G., Hussey, E. P., Flannery, S., & Ally, B. A. (2015). Preserved conceptual implicit memory for pictures in patients with Alzheimer’s disease. Brain and Cognition, 99, 112–117. https://doi.org/10.1016/j.bandc.2015.07.008
Dewhurst, S. A., & Conway, M. A. (1994). Pictures, Images, and Recollective Experience. 11.
Embree, L. M., Budson, A. E., & Ally, B. A. (2012). Memorial familiarity remains intact for pictures but not for words in patients with amnestic mild cognitive impairment. Neuropsychologia, 50(9), 2333–2340. https://doi.org/10.1016/j.neuropsychologia.2012.06.001
Ensor, T. M., Bancroft, T. D., & Hockley, W. E. (2019). Listening to the Picture-Superiority Effect. Experimental Psychology, 20.
Herzmann, G., Minor, G., & Curran, T. (2018). Neural evidence for the contribution of holistic processing but not attention allocation to the other-race effect on face memory. Cognitive, Affective, & Behavioral Neuroscience, 18(5), 1015–1033. https://doi.org/10.3758/s13415-018-0619-z
Hockley, W. E. (2008). The picture superiority effect in associative recognition. Memory & Cognition, 36(7), 1351–1359. https://doi.org/10.3758/MC.36.7.1351
Koen, J. D., & Yonelinas, A. P. (2014). The Effects of Healthy Aging, Amnestic Mild Cognitive Impairment, and Alzheimer’s Disease on Recollection and Familiarity: A Meta-Analytic Review. 41.
Martins, C. A. R., & Lloyd-Jones, T. J. (2006). Preserved Conceptual Priming in Alzheimer’s Disease. Cortex, 42(7), 995–1004. https://doi.org/10.1016/S0010-9452(08)70205-3
McBride, D. M., & Anne Dosher, B. (2002). A comparison of conscious and automatic memory processes for picture and word stimuli: A process dissociation analysis. Consciousness and Cognition, 11(3), 423–460. https://doi.org/10.1016/S1053-8100(02)00007-7
Meade, M. E., Ahmad, M., & Fernandes, M. A. (2019). Drawing pictures at encoding enhances memory in healthy older adults and in individuals with probable dementia. Aging, Neuropsychology, and Cognition, 27(6), 880–901. https://doi.org/10.1080/13825585.2019.1700899
Migo, E. M., Mayes, A. R., & Montaldi, D. (2012). Measuring recollection and familiarity: Improving the remember/know procedure. Consciousness and Cognition, 21(3), 1435–1455. https://doi.org/10.1016/j.concog.2012.04.014
Moreno-Martínez, F. J., & Montoro, P. R. (2012). An Ecological Alternative to Snodgrass & Vanderwart: 360 High Quality Colour Images with Norms for Seven Psycholinguistic Variables. PLoS ONE, 7(5), e37527. https://doi.org/10.1371/journal.pone.0037527
Peirce, J., Gray, J. R., Simpson, S., MacAskill, M., Höchenberger, R., Sogo, H., Kastman, E., & Lindeløv, J. K. (2019). PsychoPy2: Experiments in behavior made easy. Behavior Research Methods, 51(1), 195–203. https://doi.org/10.3758/s13428-018-01193-y
Pitarque, A. (2016). The effects of healthy aging, amnestic mild cognitive impairment, and Alzheimer’s disease on recollection, familiarity and false recognition, estimated by an associative process-dissociation recognition procedure. 7.
Rajaram, S. (1996). Perceptual Effects on Remembering: Recollective Processes in Picture Recognition Memory. 13.
Rajaram, S. (1993). Remembering and knowing: Two means of access to the personal past. Memory & Cognition, 21(1), 89–102. https://doi.org/10.3758/BF03211168
Rollins, L., & Riggins, T. (2018). Age-related differences in subjective recollection: ERP studies of encoding and retrieval. Developmental Science, 21(3), e12583. https://doi.org/10.1111/desc.12583
Rossion, B., & Pourtois, G. (2004). Revisiting Snodgrass and Vanderwart’s Object Pictorial Set: The Role of Surface Detail in Basic-Level Object Recognition. Perception, 33(2), 217–236. https://doi.org/10.1068/p5117
Sanfeliu, M. C., & Fernandez, A. (1996). A set of 254 Snodgrass-Vanderwart pictures standardized for Spanish: Norms for name agreement, image agreement, familiarity, and visual complexity. Behavior Research Methods, Instruments, & Computers, 28(4), 537–555. https://doi.org/10.3758/BF03200541
Schmitter-Edgecombe, M., Woo, E., & Greeley, D. R. (2009). Characterizing multiple memory deficits and their relation to everyday functioning in individuals with mild cognitive impairment. Neuropsychology, 23(2), 168–177. https://doi.org/10.1037/a0014186
Schoemaker, D., Gauthier, S., & Pruessner, J. C. (2014). Recollection and Familiarity in Aging Individuals with Mild Cognitive Impairment and Alzheimer’s Disease: A Literature Review. Neuropsychol Rev, 19.
Snodgrass, J. G., & Vanderwart, M. (1980). A Standardized Set of 260 Pictures: Norms for Name Agreement, Image Agreement, Familiarity, and Visual Complexity. Journal of Experimental Psychology: Human Learning and Memory, 6(2), 174–215.
Stenberg, G. (2006). Conceptual and perceptual factors in the picture superiority effect. European Journal of Cognitive Psychology, 18(6), 813–847. https://doi.org/10.1080/09541440500412361
Székely, A., D’Amico, S., Devescovi, A., Federmeier, K., Herron, D., Iyer, G., Jacobsen, T., & Bates, E. (2003). Timed picture naming: Extended norms and validation against previous studies. Behavior Research Methods, Instruments, & Computers, 35(4), 621–633. https://doi.org/10.3758/BF03195542
Tanaka, J., Weiskopf, D., & Williams, P. (2001). The role of color in high-level vision. Trends in Cognitive Sciences, 5(5), 211–215.
Tarr, M. J., & Bülthoff, H. H. (1998). Image-based object recognition in man, monkey and machine. Cognition, 67, 1–20.
Troyer, A. K., Murphy, K. J., Anderson, N. D., Craik, F. I. M., Moscovitch, M., Maione, A., & Gao, F. (2012). Associative recognition in mild cognitive impairment: Relationship to hippocampal volume and apolipoprotein E. Neuropsychologia, 50(14), 3721–3728. https://doi.org/10.1016/j.neuropsychologia.2012.10.018
Troyer, A. K., Vandermorris, S., & Murphy, K. J. (2016). Intraindividual variability in performance on associative memory tasks is elevated in amnestic mild cognitive impairment. Neuropsychologia, 90, 110–116. https://doi.org/10.1016/j.neuropsychologia.2016.06.011
Tsaparina, D., Bonin, P., & Méot, A. (2011). Russian norms for name agreement, image agreement for the colorized version of the Snodgrass and Vanderwart pictures and age of acquisition, conceptual familiarity, and imageability scores for modal object names. Behavior Research Methods, 43(4), 1085–1099. https://doi.org/10.3758/s13428-011-0121-9
van der Meulen, M., Lederrey, C., Rieger, S. W., van Assche, M., Schwartz, S., Vuilleumier, P., & Assal, F. (2012). Associative and Semantic Memory Deficits in Amnestic Mild Cognitive Impairment as Revealed by Functional Magnetic Resonance Imaging. Cognitive and Behavioral Neurology, 25(4), 195–215. https://doi.org/10.1097/WNN.0b013e31827de67f
Viggiano, M. P., Vannucci, M., & Righi, S. (2004). A New Standardized Set of Ecological Pictures for Experimental and Clinical Research on Visual Object Processing. Cortex, 40(3), 491–509. https://doi.org/10.1016/S0010-9452(08)70142-4
Wagner, A. D., Gabrieli, J. D. E., & Verfaellie, M. (1997). Dissociations Between Familiarity Processes in Explicit Recognition and Implicit Perceptual Memory. 19.
Wammes, J. D., Meade, M. E., & Fernandes, M. A. (2016). The drawing effect: Evidence for reliable and robust memory benefits in free recall. Quarterly Journal of Experimental Psychology, 69(9), 1752–1776. https://doi.org/10.1080/17470218.2015.1094494
Wang, P., Li, J., Li, H., Li, B., Jiang, Y., Bao, F., & Zhang, S. (2013). Is emotional memory enhancement preserved in amnestic mild cognitive impairment? Evidence from separating recollection and familiarity. Neuropsychology, 27(6), 691–701. https://doi.org/10.1037/a0033973
Weldon, M. S., Roediger, H. L., III, & Challis, B. H. (1989). The properties of retrieval cues constrain the picture superiority effect. 11.
Weldon, M. S., & Roediger, H. L. (1987). Altering retrieval demands reverses the picture superiority effect. 12.
Westerberg, C., Mayes, A., Florczak, S. M., Chen, Y., Creery, J., Parrish, T., Weintraub, S., Mesulam, M.-M., Reber, P. J., & Paller, K. A. (2013). Distinct medial temporal contributions to different forms of recognition in amnestic mild cognitive impairment and Alzheimer’s disease. Neuropsychologia, 51(12), 2450–2461. https://doi.org/10.1016/j.neuropsychologia.2013.06.025
Whitehouse, A. J. O., Maybery, M. T., & Durkin, K. (2006). The development of the picture-superiority effect. British Journal of Developmental Psychology, 24(4), 767–773. https://doi.org/10.1348/026151005X74153
Wolk, D. A., Dunfee, K. L., Dickerson, B. C., Aizenstein, H. J., & DeKosky, S. T. (2011). A medial temporal lobe division of labor: Insights from memory in aging and early Alzheimer disease. Hippocampus, 21(5), 461–466. https://doi.org/10.1002/hipo.20779
Wolk, D. A., Mancuso, L., Kliot, D., Arnold, S. E., & Dickerson, B. C. (2013). Familiarity-based memory as an early cognitive marker of preclinical and prodromal AD. Neuropsychologia, 51(6), 1094–1102. https://doi.org/10.1016/j.neuropsychologia.2013.02.014
Wolk, D. A., Signoff, E. D., & DeKosky, S. T. (2008). Recollection and familiarity in amnestic mild cognitive impairment: A global decline in recognition memory. Neuropsychologia, 46(7), 1965–1978. https://doi.org/10.1016/j.neuropsychologia.2008.01.017
Yonelinas, A. P. (2002). The Nature of Recollection and Familiarity: A Review of 30 Years of Research. Journal of Memory and Language, 46(3), 441–517. https://doi.org/10.1006/jmla.2002.2864
Yoon, C., Feinberg, F., Luo, T., Hedden, T., Gutchess, A. H., Chen, H.-Y. M., Mikels, J. A., Jiao, S., & Park, D. C. (2004). A cross-culturally standardized set of pictures for younger and older adults: American and Chinese norms for name agreement, concept agreement, and familiarity. Behavior Research Methods, Instruments, & Computers, 36(4), 639–649. https://doi.org/10.3758/BF03206545
Table 5: Spelling corrections / manipulations to naming responses.
Table 6: Normative data for all photograph items.
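For Table 6, the photograph labels were normalised by stripping any trailing digits, per the `gsub("[0-9].*", "", norms$Photograph)` call used in the analysis script. A minimal sketch of that same cleaning step, shown in Python rather than R (the example labels are illustrative; only the regular expression comes from the script):

```python
import re

def strip_trailing_digits(label: str) -> str:
    """Drop the first digit and everything after it, mirroring
    the R call gsub("[0-9].*", "", norms$Photograph)."""
    return re.sub(r"[0-9].*", "", label)

# Hypothetical labels where a trailing digit marks a repeated item.
photographs = ["pencil2", "apple", "chair10"]
cleaned = [strip_trailing_digits(p) for p in photographs]
# cleaned == ["pencil", "apple", "chair"]
```

Note that the pattern removes everything from the first digit onward, so labels with internal digits (e.g. `glass2big`) lose their suffix as well; this matches the behaviour of the original `gsub` call.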